102 research outputs found

    EC-BLAST: a tool to automatically search and compare enzyme reactions.

    We present EC-BLAST (http://www.ebi.ac.uk/thornton-srv/software/rbl/), an algorithm and Web tool for quantitative similarity searches between enzyme reactions at three levels: bond change, reaction center and reaction structure similarity. It uses bond changes and reaction patterns for all known biochemical reactions derived from atom-atom mapping across each reaction. EC-BLAST has the potential to improve enzyme classification, identify previously uncharacterized or new biochemical transformations, improve the assignment of enzyme function to sequences, and assist in enzyme engineering.

    Confab - Systematic generation of diverse low-energy conformers

    Background: Many computational chemistry analyses require the generation of conformers, either on-the-fly, or in advance. We present Confab, an open source command-line application for the systematic generation of low-energy conformers according to a diversity criterion. Results: Confab generates conformations using the 'torsion driving approach' which involves iterating systematically through a set of allowed torsion angles for each rotatable bond. Energy is assessed using the MMFF94 forcefield. Diversity is measured using the heavy-atom root-mean-square deviation (RMSD) relative to conformers already stored. We investigated the recovery of crystal structures for a dataset of 1000 ligands from the Protein Data Bank with fewer than 1 million conformations. Confab can recover 97% of the molecules to within 1.5 Å at a diversity level of 1.5 Å and an energy cutoff of 50 kcal/mol. Conclusions: Confab is available from http://confab.googlecode.com.
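The diversity and energy criteria described in the Confab abstract can be illustrated with a minimal sketch. This is not Confab's implementation: the function names `rmsd` and `select_diverse` are hypothetical, the RMSD is computed on raw coordinates without superposition, and the energy cutoff is assumed to be measured relative to the lowest-energy conformer in the input.

```python
import math


def rmsd(coords_a, coords_b):
    """Plain coordinate RMSD between two equally sized lists of (x, y, z).

    Simplification: no superposition/alignment is performed.
    """
    total = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(total / len(coords_a))


def select_diverse(conformers, rmsd_cutoff=1.5, energy_cutoff=50.0):
    """Keep conformers that pass both the energy and the diversity filter.

    conformers: iterable of (energy, coords) pairs (hypothetical layout).
    Assumption: the energy cutoff is relative to the lowest input energy.
    """
    conformers = list(conformers)
    if not conformers:
        return []
    lowest = min(energy for energy, _ in conformers)
    kept = []
    for energy, coords in conformers:
        if energy - lowest > energy_cutoff:
            continue  # too high in energy
        # Store only if sufficiently different from everything kept so far.
        if all(rmsd(coords, other) >= rmsd_cutoff for _, other in kept):
            kept.append((energy, coords))
    return kept


example = [
    (0.0, [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]),
    (1.0, [(0.0, 0.0, 0.0), (1.0, 0.1, 0.0)]),    # near-duplicate of the first
    (2.0, [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]),    # distinct geometry
    (100.0, [(0.0, 0.0, 0.0), (9.0, 0.0, 0.0)]),  # above the energy cutoff
]
selected = select_diverse(example)
```

In this toy input the second conformer is rejected as a near-duplicate and the fourth is rejected on energy, so two conformers remain.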

    AZOrange - High performance open source machine learning for QSAR modeling in a graphical programming environment

    Background: Machine learning has a vast range of applications. In particular, advanced machine learning methods are routinely and increasingly used in quantitative structure activity relationship (QSAR) modeling. QSAR data sets often encompass tens of thousands of compounds and the size of proprietary, as well as public data sets, is rapidly growing. Hence, there is a demand for computationally efficient machine learning algorithms, easily available to researchers without extensive machine learning knowledge. In granting the scientific principles of transparency and reproducibility, Open Source solutions are increasingly acknowledged by regulatory authorities. Thus, an Open Source state-of-the-art high performance machine learning platform, interfacing multiple, customized machine learning algorithms for both graphical programming and scripting, to be used for large scale development of QSAR models of regulatory quality, is of great value to the QSAR community. Results: This paper describes the implementation of the Open Source machine learning package AZOrange. AZOrange is specially developed to support batch generation of QSAR models in providing the full work flow of QSAR modeling, from descriptor calculation to automated model building, validation and selection. The automated work flow relies upon the customization of the machine learning algorithms and a generalized, automated model hyper-parameter selection process. Several high performance machine learning algorithms are interfaced for efficient data set specific selection of the statistical method, promoting model accuracy. Using the high performance machine learning algorithms of AZOrange does not require programming knowledge as flexible applications can be created, not only at a scripting level, but also in a graphical programming environment. Conclusions: AZOrange is a step towards meeting the needs for an Open Source high performance machine learning platform, supporting the efficient development of highly accurate QSAR models fulfilling regulatory requirements.

    Open Babel: An open chemical toolbox

    Background: A frequent problem in computational modeling is the interconversion of chemical structures between different formats. While standard interchange formats exist (for example, Chemical Markup Language) and de facto standards have arisen (for example, SMILES format), the need to interconvert formats is a continuing problem due to the multitude of different application areas for chemistry data, differences in the data stored by different formats (0D versus 3D, for example), and competition between software along with a lack of vendor-neutral formats. Results: We discuss, for the first time, Open Babel, an open-source chemical toolbox that speaks the many languages of chemical data. Open Babel version 2.3 interconverts over 110 formats. The need to represent such a wide variety of chemical and molecular data requires a library that implements a wide range of cheminformatics algorithms, from partial charge assignment and aromaticity detection, to bond order perception and canonicalization. We detail the implementation of Open Babel, describe key advances in the 2.3 release, and outline a variety of uses both in terms of software products and scientific research, including applications far beyond simple format interconversion. Conclusions: Open Babel presents a solution to the proliferation of multiple chemical file formats. In addition, it provides a variety of useful utilities from conformer searching and 2D depiction, to filtering, batch conversion, and substructure and similarity searching. For developers, it can be used as a programming library to handle chemical data in areas such as organic chemistry, drug design, materials science, and computational chemistry. It is freely available under an open-source license fro

    Quantitative global studies of reactomes and metabolomes using a vectorial representation of reactions and chemical compounds

    Background: Global studies of the protein repertories of organisms are providing important information on the characteristics of the protein space. Many of these studies entail classification of the protein repertory on the basis of structure and/or sequence similarities. The situation is different for metabolism. Because there is no good way of measuring similarities between chemical reactions, there is a barrier to the development of global classifications of "metabolic space" and subsequent studies comparable to those done for protein sequences and structures. Results: In this work, we propose a vectorial representation of chemical reactions, which allows them to be compared and classified. In this representation, chemical compounds, reactions and pathways may be represented in the same vectorial space. We show that the representation of chemical compounds reflects their physicochemical properties and can be used for predictive purposes. We use the vectorial representations of reactions to perform a global classification of the reactome of the model organism E. coli. Conclusions: We show that this unsupervised clustering results in groups of enzymes more coherent in biological terms than equivalent groupings obtained from the EC hierarchy. This hierarchical clustering produces an optimal set of 21 groups which we analyzed for their biological meaning.

    Hydrophobicity and Charge Shape Cellular Metabolite Concentrations

    What governs the concentrations of metabolites within living cells? Beyond specific metabolic and enzymatic considerations, are there global trends that affect their values? We hypothesize that the physico-chemical properties of metabolites considerably affect their in-vivo concentrations. The recently achieved experimental capability to measure the concentrations of many metabolites simultaneously has made the testing of this hypothesis possible. Here, we analyze such recently available data sets of metabolite concentrations within E. coli, S. cerevisiae, B. subtilis and human. Overall, these data sets encompass more than twenty conditions, each containing dozens (28-108) of simultaneously measured metabolites. We test for correlations with various physico-chemical properties and find that the number of charged atoms, non-polar surface area, lipophilicity and solubility consistently correlate with concentration. In most data sets, a change in one of these properties elicits a ∼100 fold increase in metabolite concentrations. We find that the non-polar surface area and number of charged atoms account for almost half of the variation in concentrations in the most reliable and comprehensive data set. Analyzing specific groups of metabolites, such as amino-acids or phosphorylated nucleotides, reveals even a higher dependence of concentration on hydrophobicity. We suggest that these findings can be explained by evolutionary constraints imposed on metabolite concentrations and discuss possible selective pressures that can account for them. These include the reduction of solute leakage through the lipid membrane, avoidance of deleterious aggregates and reduction of non-specific hydrophobic binding. By highlighting the global constraints imposed on metabolic pathways, future research could shed light onto aspects of biochemical evolution and the chemical constraints that bound metabolic engineering efforts.

    ST3Gal.I sialyltransferase relevance in bladder cancer tissues and cell lines

    Background: The T antigen is a tumor-associated structure whose sialylated form (the sialyl-T antigen) involves the altered expression of sialyltransferases and has been related with worse prognosis. Since little or no information is available on this subject, we investigated the regulation of the sialyltransferases, able to sialylate the T antigen, in bladder cancer progression. Methods: Matched samples of urothelium and tumor tissue, and four bladder cancer cell lines were screened for: ST3Gal.I, ST3Gal.II and ST3Gal.IV mRNA level by real-time PCR. Sialyl-T antigen was detected by dot blot and flow cytometry using peanut lectin. Sialyltransferase activity was measured against the T antigen in the cell lines. Results: In nonmuscle-invasive bladder cancers, ST3Gal.I mRNA levels were significantly higher than corresponding urothelium (p < 0.001) and this increase was twice more pronounced in cancers with tendency for recurrence. In muscle-invasive cancers and matching urothelium, ST3Gal.I mRNA levels were as elevated as nonmuscle-invasive cancers. Both non-malignant bladder tumors and corresponding urothelium showed ST3Gal.I mRNA levels lower than all the other specimen groups. A good correlation was observed in bladder cancer cell lines between the ST3Gal.I mRNA level, the ST activity (r = 0.99; p = 0.001) and sialyl-T antigen expression, demonstrating that sialylation of T antigen is attributable to ST3Gal.I. The expression of sialyl-T antigens was found in patients' bladder tumors and urothelium, although without a marked relationship with mRNA level. The two ST3Gal.I transcript variants were also equally expressed, independently of cell phenotype or malignancy. Conclusion: ST3Gal.I plays the major role in the sialylation of the T antigen in bladder cancer. The overexpression of ST3Gal.I seems to be part of the initial oncogenic transformation of bladder and can be considered when predicting cancer progression and recurrence.

    A network-based target overlap score for characterizing drug combinations: High correlation with cancer clinical trial results

    Drug combinations are highly efficient in systemic treatment of complex multigene diseases such as cancer, diabetes, arthritis and hypertension. Most currently used combinations were found in empirical ways, which limits the speed of discovery for new and more effective combinations. Therefore, there is a substantial need for efficient and fast computational methods. Here, we present a principle that is based on the assumption that perturbations generated by multiple pharmaceutical agents propagate through an interaction network and can cause unexpected amplification at targets not immediately affected by the original drugs. In order to capture this phenomenon, we introduce a novel Target Overlap Score (TOS) that is defined for two pharmaceutical agents as the number of jointly perturbed targets divided by the number of all targets potentially affected by the two agents. We show that this measure is correlated with the known effects of beneficial and deleterious drug combinations taken from the DCDB, TTD and Drugs.com databases. We demonstrate the utility of TOS by correlating the score to the outcome of recent clinical trials evaluating trastuzumab, an effective anticancer agent utilized in combination with anthracycline- and taxane-based systemic chemotherapy in HER2-receptor (erb-b2 receptor tyrosine kinase 2) positive breast cancer. © 2015 Ligeti et al.
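The TOS definition in this abstract (jointly perturbed targets divided by all targets potentially affected by the two agents) reads as an intersection-over-union ratio, which the sketch below illustrates. The propagation step is an assumption: the abstract does not say how perturbations spread, so `reachable_targets` is a hypothetical stand-in that simply collects nodes within a fixed number of edges of each drug's direct targets. The toy network and target names are invented for illustration.

```python
from collections import deque


def reachable_targets(network, seeds, max_steps=1):
    """Hypothetical propagation model: nodes within max_steps edges of seeds.

    network: dict mapping a node to an iterable of neighbouring nodes.
    """
    seen = set(seeds)
    frontier = deque((seed, 0) for seed in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_steps:
            continue  # do not expand beyond the step limit
        for neighbour in network.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, depth + 1))
    return seen


def target_overlap_score(perturbed_a, perturbed_b):
    """Jointly perturbed targets divided by all targets affected by either agent."""
    a, b = set(perturbed_a), set(perturbed_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy linear network T1 - T2 - T3 - T4 (invented names).
toy_network = {
    "T1": ["T2"],
    "T2": ["T1", "T3"],
    "T3": ["T2", "T4"],
    "T4": ["T3"],
}
affected_a = reachable_targets(toy_network, {"T1"})  # {"T1", "T2"}
affected_b = reachable_targets(toy_network, {"T3"})  # {"T2", "T3", "T4"}
tos = target_overlap_score(affected_a, affected_b)   # 1 shared / 4 total = 0.25
```

Here the two drugs share no direct target, yet one step of propagation makes T2 jointly perturbed, which is the kind of indirect overlap the abstract describes.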